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We extend known saddlepoint tail probability approximations to multivariate cases, including 
multivariate conditional cases. Our approximation applies to both continuous and lattice vari- 
ables, and requires the existence of a cumulant generating function. The method is applied to 
some examples, including a real data set from a case-control study of endometrial cancer. The 
method contains less terms and is easier to implement than existing methods, while showing an 
accuracy comparable to those methods. 
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1. Introduction 

Let Xi,X2,...,X„ be independent and identically distributed random vectors from a 
density /x(-) on R d . We construct an accurate multivariate saddlepoint approximation 
of the tail probability of the mean random vector X = (Xi + X2 + • • • + X„)/n. We also 
develop a similar approximation for conditional tail probabilities. The approximation has 
a relative error of 0(n~ 1 ), uniformly over a compact set of x, a realization of X, under 
some general conditions. Our method utilizes the likelihood ratio statistic, routinely 
calculated by standard software, which makes the approximation easy to implement. 

The Edgcworth expansion is a natural competitor to the saddlepoint approximation. 
This expansion has a uniformly bounded absolute error and works well in the center 
of the distribution being approximated. However, the approximation deteriorates at the 
far tail of the distribution, where it can sometimes even attain negative values. [1] first 
applied saddlepoint techniques to the approximation of a probability density function. 
Saddlepoint approximation addresses the problem of degradation outside a region of 
radius 0{n~ 1 / 2 ) about -E(Xj) by bounding the relative error, rather than the absolute 
error, of the approximation over the admissible range. 

[1] discussed approximating the density of X when the dimension d=l, that is, the 
univariate case. The approximation achieved a relative error of 0(n _1 ) uniformly over 
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the whole admissible range of the variable, under some conditions. The method uses the 
Fourier inversion formula, which involves moment generating, or characteristic, functions 
and complex integration. In this approach, the path of integration is shifted so that it 
passes through the saddlepoint of the integrand and follows the steepest descent curve 
at the neighborhood of the saddlepoint. The asymptotic property follows from a lemma 
due to [14]. 

Extensions of univariate saddlepoint approximation of tail probabilities P(X > x) for 
the means of independent random variables have also been studied. This calculation is 
more difficult, in that, unlike the density function case, the integrand of the Fourier 
inversion integral for tail probabilities has a pole at zero. 

[10] presented a general saddlepoint approximation technique that can be applied to tail 
probability approximation, based on Laplace approximation of the integrated saddlepoint 
density, with an error of 0(n _1 ). Robinson used an argument involving a conjugate 
exponentially shifted distribution family and the Edgcworth expansion. The terms of 
the expansion can then be integrated termwise. There is no direct explicit formula for 
the integration of each term, but the terms may be computed recursively. This method 
applies when x > E(X). When x < E(X), Boole's law and reflection of the distribution 
must be used. 

[8] provided an alternative approximation. [2] derived this technique, using a trans- 
formation of variables to directly address the local quadratic behavior of the numerator 
exponent. The integral is then split into two parts, one which contains a pole, but can be 
integrated exactly and explicitly, and the other which only has removable singularities 
and can be expanded and approximated accurately. The virtue of this method is that 
the approximation is compact and can be computed without recursion, and the formula 
is valid over the whole range of admissible x. 

[9] thoroughly discussed the usefulness of the saddlepoint method in a review of the 
method focusing on a variety of applications to statistical inference. 

[5] generalized the univariate Robinson approach under the Daniels framework and 
achieved an error of size 0(rt _1 ). The method uses integral expressions for the tail prob- 
ability in the multivariate case and presents a multivariate expansion of the numerator 
of the integrand and a termwise multivariate integration using recursion. This approach 
shares the drawback of Robinson's approach in that it requires a positivity constraint on 
the ordinate. 

[13] generalized Lugannani and Rice's method to the case of a bivariatc probability 
distribution function using variable transformations. [5] used a different method of proof 
and showed that the error term is of order 0(n _1 ); his method is limited to d = 2. 
Furthermore, Wang's development involves an inversion integral in which the pole of one 
variable depends on the values of other variables in a fundamentally nonlinear way. 

Wang's proof of the error rate in the neighborhood of the pole is incomplete. In this 
paper, a way of effectively extending Lugannani and Rice's method to the multivariate 
case, which uses a different transformation formula from Wang's and can be used in the 
case d > 2, is proposed. The method uses fewer terms and can be extended to multivariate 
conditional cases. 

Our proposed saddlepoint approximation may be used to test null and alternative 
hypotheses concerning a multivariate parameter when the hypotheses are specified by 
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systems of linear inequalities. [6] applied the method of [5], in conjunction with the 
adjusted profile likelihood, in such a case. For instance, [6] refers to data presented by [12] 
on 63 case-control pairs of women with endometrial cancer. The occurrence of endometrial 
cancer is influenced by explanatory variables including gall bladder disease, hypertension 
and non-estrogen drug use. The test of whether hypertension or non-cstrogen drug use is 
associated with an increase in endometrial cancer will be performed, conditional on the 
sufficient statistic value associated with gall bladder disease. 

The remainder of the paper is organized as follows. Section 2 provides the unified frame- 
work under which both unconditional and conditional tail probability approximations are 
considered. Section 3 derives formulas for multivariate unconditional distributions. Sec- 
tion 4 focuses on conditional distributions. Section 5 presents five examples and shows 
the approximation results. 

2. Multivariate extension 

The unconditional and conditional tail probability approximation share some common 
characteristics. We derive them in a unified way. Applying the Fourier inversion theorem 
and Fubini's theorem, as in [5], we find that both the unconditional and conditional tail 
probability approximations require evaluation of an integral of the form 



where K is the cumulant generating function, which is the natural logarithm of the 
moment generating function, and c is any positive d-dimcnsional vector. This will be 
discussed in Section 4. In the unconditional case, for continuous variables, K is a vector 
of length d, with every entry infinity, t* = t and p(r) = r; for unit lattice, K is a vector 
of length d, with every entry n, t* is t corrected for continuity, p(r) = 2sinh(r/2) and 
d = do. In the conditional case, the setting is the same, except that do equals d minus 
the dimension of the conditioning variables. 

[2] recast a great deal of the saddlepoint literature in terms of inversion integrals of the 
form (2.1), rescaled so that the exponent is exactly quadratic. This rescaling includes the 
multiplier for the linear term in the exponent; this linear term is the signed root of the 
likelihood ratio statistic. The idea of using the modified signed likelihood ratio statistic 
was proposed in [3] . [4] defines a multivariate version of this reparameterization and also 
defines the multiplier for the linear terms; again, these are signed roots of likelihood ratio 
statistics, but, this time, for a sequence of nested models: 




(2.1) 



-w T w = min(K(7) — 7 T t*) 



and 



--(w- w) T (w - w) = K(t) - T T t* - min^ (7) - 7 T t*). 
2 7 
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Further specification of w and w is needed. For any vector v of length d, let Vj 
be the vector consisting of the first j elements, that is, (i>i, 1)2, ■ ■ ■ , Vj) T . For instance, 
7j = (71, 72, ■ • • , 7j) T , Tj = (ti,T2, ■ . . ,Tj) and 3 is the zero vector (0,0,..., 0) T with 
dimension j. Let v_ 3 - be the vector consisting all but the first j elements of v, that is, 
(vj+i,Vj + 2, . . . , Vd) T - [4], Chapter 6 defines w and w using 

-\w 2 = min (A( 7 )- 7 T t*)- min (A( 7 )- 7 T t*), (2.2a) 
--(Wj -Wj) 2 = min (K(~f) -7 T t*) - min (A' (7) - 7 T t*). (2.2b) 

2 7iT;f-l=T3-l 7,7j=T"j 

This definition is not invariant with regard to the order of the coordinates. Also, note 
that Wj is a function of only Tj, but not of any element of T-j Vj. The same holds true 
for Tj as a function of w. 

We now construct more explicit formulas for w and w. Let 

Tj hj ) = (71: 72, • • • , 7j , (rfj ) , fj+2 (rfj), ■ ■ ■ , Td hi ) ) 

be the minimizer of (K(-f) — 7 T t*) when the first j variables are fixed. The function 
fkifj) above is the minimizer for variable k when the first j variables are fixed, for 
k>j. 

Using the above notation, the definitions of w and w can be rewritten as 

- ii^=A-(T 3 -_i(0 3 -_i))-T 3 -_i(0 3 -_i) T t* - (K(fj(0j)) - r 3 (0 3 ) T t*), (2.3a) 
\( W j - wj) 2 = K{r j—\{r j—\)) - f 3 _ 1 (r 3 _ 1 ) T t* - (A-(f 3 (r 3 )) - f 3 (r 3 ) T t*), (2.3b) 



2 



where t 3 _i(-) is set to r when j = 1 for succinctness of expression. 

By choosing a sign to make w and w increasing functions of f and r, we can further 
specify them as follows: 

Wj =sign(f 3 (0 3 -_i)) 

(2.4a) 

x 2[A'(Tj_i(0j_i)) — Tj_i(0j_i) T t* - (^(^(0,)) -TjiOjVt*)], 

Wj = Wj + sign(r 3 - f 3 (r^-i)) 

(2.4b) 



x v /-2[A(r 3 _ 1 (r J _ 1 )) - fj^Tj^t* - (A(r,(r 3 )) - T^Ft*)]. 

The derivation of the [8] approximation provided by [2] requires identification of the 
simple pole in the inversion integrand. We need to match zeros in the denominator of the 
multivariate integrand with functions of the variables in the new parameterization; the 
points at which this matching occurs will be denoted by a tilde. The quantities above, 
such as t, w, Tj(tj_i) and functional relationships between t and w, etcetera, can be 
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solved numerically by Newton-Raphson methods, or even analytically in some cases. 
Finally, we define a function Wjfaj-i) such that Tj(wi,W2, ■ ■ ■ , u>j(Wj_i)) = for j > 1. 
It can be verified that the following properties hold: 



Tj = if and only if Wj = 0; 
w j (0 j - 1 ) = forj>l; 
Tj = fj(Tj—i) if and only if Wj = Wj 

Tj = Tj if and only if w, = Wj 



for j > 1; 



(2.5a) 
(2.5b) 
(2.5c) 
(2.5d) 



Below, the superscript of a function denotes differentiation with respect to the corre- 
sponding argument of the function. We will employ the same use of superscripts in 
the subsequent text of the paper, except that when the superscript is a set, it de- 
notes difference, as defined at the end of this section. Also, a superscripted "T" de- 
notes the transpose of matrix. We can obtain Wj = Wj^Wj-i) and wq = w'(wj_i), 
which will be used in later sections. Substituting Wj = Wj, Tj = 0, Tj_i = Tj-i and 
Tj = (fi,f 2 , . . . ,fj-i, 0) T = (Tj_i,0) T into (2.4b), we obtain 



^+sign(0-f,) v /-2[^(f) 



t T t* 



(KiTjifj^O))-^-!,^*)]. (2.6) 
Differentiating (2.3b) with respect to Wk and rearranging terms, we obtain 



W: 



l=k 



TAT 



0)) 



(2.7) 



for k < j . The derivatives evaluated at the point w; can be obtained by differentiating 
(2.3b) with respect to Wk once or twice, depending on whether or not Wj = Wj , and solving 
the resulting system of equations. In particular, we are interested in 



dw.j 



if W.i = 



Wj, 



Wj 



■ Wj 



(2.8) 



KKrj{r 3 ))-ty 
for j <do, where denotes the first j elements, and 



if Wj Wj , 



TT 

11 dw, 

j=d + l ■> 



= n 

(w dQ ,w_ tio ) j-d a + l 



(2.9) 



Er=i^ J ' , (T*(r*)K([T*(r < , )] J ) 

where, for succinctness of expression, we define if(-) to be 1 when I — j. For Z > j, we 
obtain (■) by differentiating both sides of the definition of if (•), that is, K l (-) = t* with 
respect to Tj VZ > j, and solving the system of equations. 
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Under this transformation of variables from t to w, the Jacobian is just the product 
of the diagonal terms of the Jacobian matrix and (2.1) can be expressed as 



n d-d ,W 


fiK exp( 


(27ti) d U_ 


iK 


n d-d 




(2ni) d 


J w— iK 


n d-d 


/*w+iK 

/ 


(2ni) d 


J w— iK 


where 





ns*w 



exp(rt[(l/2)w T w - w T w]) -pj dr 3 Uf=i ( w j - ">j(wj-i)) 



l y d LV 1 ' ~G{t) dw, 



n?iHj-^(wM)) A d Tj dr. 



(w dQ ,w_d ) 



and, to simplify notation, we set ?Dj(wj_i) to zero for j = 1. For later convenience, we 
write G(t) as a function r instead of w. The relation ~ in the last step indicates exact 
equality in the unconditional case, where d = do, but holds with a relative error of 0(n _1 ) 
in the conditional case, which we will discuss in Section 4. Hereafter, we use ~ to denote 
approximation with a relative error of 0(?i _1 ) of both the left-hand side and the tail 
probability, and we use ~ (~ with a dot above it) in the case where the right-hand side 
is an approximation with a relative error of (^(n^ 1 / 2 ) of the left-hand side. 

The last integral in (2.10) will be evaluated by splitting it into rather simple terms 
involving poles and more complicated terms involving analytic functions. We can decom- 
pose (2.10) into 2 d ° terms. Let U= {1, 2, . . . , do} be the index set of integers from 1 to 
do. For set s C U, define G s (t) = G(t s ), where the vector t s is defined by 



3 ' 



if j G s, 
0, if j<£s. 

For example, if d = 3, then G^{t) = G(n,r 2 ,0). Now, for t C U, define H l = 
J2«ct( — 1)'*~ s 'G s (t), where | • | denotes the cardinality, that is, the number of el- 
ements of a set. For example, = G^(r) - G^(t) - G^{t) + G {t) = 
G(ti,t 2 ,0) - G(ti,0,0) - G(0,r 2 ,0) + G(0,0,0), where denotes the empty set. We 
conclude that G(t) = Y^tcu^- This decomposition holds by induction on do- Noting 
that Vs C U and a e s, H s (r {a} ) = 0, we see that 

H*(t) 



is analytic. In other words, \t\ product terms in the denominator of the integrand in (2.10) 
are 'absorbed' by i?'(r), leaving the remaining (do — \t\) product terms unabsorbed. As 
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explained in [5], each term that is absorbed contributes a relative error of 0(n -1 / 2 ). 
Therefore, if we let /* be the integral corresponding to H f , then we obtain 

r^ exp( [(1/2)wTw _^ Tw]) ^ 

(2my n * i {wj _ ^,(w,_i)) ltl ^ tQU 

In the next two sections, we compute the /*, \t\ < 1, t C U, for distribution and conditional 
distribution, respectively. 1 



3. Multivariate distribution approximation 

In the unconditional continuous case, we have d = do and 

C(T] ntiK -^K-i)) ^ dr 3 

in :-w. y ) f = \d Wj - 



Therefore, 



r = ^ ^(f/Vw-w^l) w 

■i-jao 



(27ri)*^_ ioo n*i(»i-%(w 3 --i)) 

1 r +[co exp(n[(l/2)w T w - w T w]) 



(3-1) 



(27ri)*> J^_ ioQ ^f =1 ( Wj - Wj (W,-!)) 



dw 



since G(0) = by properties (2.5a) and (2.5b). 

Let Uj = Wj —Wj(\Vj_i), u be such that w(u) = u and g(u) = iw(u) T w(u) — w T w(u). 
By changing variables, with Jacobian equal to 1, we have 

^e X p(n[ 5 (u)]) du> 



(27ti)* n d ° 



3 = 1 J 



The integration in (3.2) cannot be integrated out exactly in general. However, using 
the same argument as in [5], we approximate it by expanding g(u) about u up to the 
third degree; after termwise integration, the resulting approximation to I has relative 
error 0(n -1 ). So, I can be approximated by 



I = 



exp(n[<? + (l/2)g : > k (u j - Uj){u k - u k )}) 



(2m) d o y fl _ ioo Ylfu 



1 More detailed derivations and formulae for bivariate distributions can be found at 
http : //stat .rutgers . edu/resources/technical_reportslO .html . 
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1 + ^9 Jkl (uj - u-j){u k - u k )(ui - ii;)J du 
1 r +io ° exp(n[g + (l/2)g^ k { Uj - u^Uk - u k )\) 



p IO ° exp^[g + (l/2)ff^( Mj -%)K-^)]) ^ ^ 

J u — i oo FT,, — i Uj 



(2m) d ° J & _ ioo U%i'^ 

1 /•" +io ° exp(n[g + (l/2)^ fe ( % - - uj)(u k - «*)]) 



(27ti) d o 7 ]i _. co J]* i 



x -g jkl (uj - u.j){u k - u k )(ui - ui)du, 

where, for brevity, we write g r for g r (u). All derivatives of g evaluated at u can be 
computed and, in particular, g J = 0. Here, we use tensor notation, that is, the use of 
superscripts and subscripts to denote summation over all possible combinations, by which 
we are able to omit the summation symbol. The computation of the second integral is 
addressed in [5] . The details involve partial derivatives of some functions up to the second 
or third degree; these are algebraically complicated and therefore omitted here. For the 
first integral, rearrange the terms in the numerator in the order of the degree of u. The 
first integral is quadratic and can be computed as 

1 exp(n[g + (l/2)ff*( u 3 - -flj)(u fc -fl fc )]) du 



d ° u- 
=i u j 



exp(n[(g + (l/2)g^ Uj u k ) - g> k u k u 3 + (l/2)g^u jUfc ]) ^ 



= C #(y ,£ ), 

where C = exp(n[g + ^g^iijUk)}), y is a vector whose jih element is \fng^ k u k j y/ gii 
and $ is the tail probability of a standard multivariate normal distribution with mean 
and covariance matrix X with elements g^ k / yg 7 ^ g kk . The last of the above equations 
can be obtained by changing variables to v, where Vj = Uj / \J gii . 
For 7*, t = r, we have 

/W = 1 r + - ^[(l/Vw-w^w]) . GM( T ) G(0 ) ^ 

We perform a similar change of variable from w to u as in computing I , except that 
u r = w r . We then have 

J M = _i_/ a+i00 exp(n[ g M (u)]) 

(27ti)^o y G _ ioo n u 

(3.6) 

1 r +i °° exp(n[c> } + (l/2)(u - u) T c C M (u - (j) - U M( U - (j))]) 
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where g^(u) is the exponent as a function of u after the change of variable, (u) = 

G w -ib\-w Gt *i) ■ c oo^ = cc^ is the matrix with elements ccf^ — \g^\ 13 (u) and 

is the vector such that cf^ = [g^ r ^] l (u). We can perform a further change of variables 

I x r \ 

Vj = \/ny Cjj Uj so that 

H^JfiL. r~ ° P((1 ^;f' V 7" V) 4-'Wdv, (3.7, 
v v 

where = exp(n[cQ^ + ^u T cc^ r J"u]), S| r ^ is the covariance matrix with elements 
pM] y =c W/y^~W and t - v] {r} = v^Wu], _ The functiQn ft W (v) . s analytiCi 



but ^ ^ is not analytic, and we cannot use Watson's lemma directly. We use the 

following technique. Let t r = [S^ r ^v] r and tj = yl — ([e£' ^rj) 2 "^" for j 7^ r. Perform a 
change of variables to obtain 

where Q M (t) = ±t T Il| r} t-y| r} t, here S^ r} being the matrix with elements [Sf" } ] r j = 
for j ^ r, 

y^ being the vector with elements 

r-Mi r-,Wi j r-Wi [yi^lj - [sj r} ]rj[yi r} ]r , . , 

[y t ] r = [yi ]r and [yt L,- = , f . = — for j ^ r. 

For a set s, let t s denote the vector such that [t s ]k = if k £ s and [t s ]k = tk if k E s. 
We have 



7 W.^L exp(QM(t)) {r} 



(3.9) 



The argument that the above holds follows similar reasoning as in (2.11), except that we 
only need to consider the main term here. Now, because t r can be separated after the 
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change of variable and by Watson's lemma, we have 




cxp(QM( t )-((l/2)^-[yW] r ^) 



pt r -\-'\oo 
t r — ioo 



exp((l/2)*2 




/i t W (t M )dt 



(3.10) 



27ti 




where y^ is y| with the rth element removed and is s| with the rth row and 
column removed. 

Multivariate tail probability approximations for unit lattice variables follow along the 
same lines, except that 



Since lim^-j.o (2sinh(x/2)/x) = 1, any analytic property in the continuous case still holds 
in the lattice case. 

4. Multivariate conditional distribution 
approximation 

Consider a multivariate canonical exponential family. In practice, we are often only in- 
terested in a subset of the parameters in a given statistical model, with the other model 
parameters usually treated as nuisance parameters. The distribution of the sufficient 
statistics associated with the parameters of interest, conditional on the sufficient statis- 
tics associated with the nuisance parameters, contains the parameters of interest and not 
the nuisance parameters. We can therefore use the conditional distributions instead of 
the original distribution in the study. For instance, in testing equality of proportions for 
a 2 x 2 contingency table, we condition on the row or column margins. Another example 
is logistic regression, where inference on some regression parameters is often performed 
conditionally on sufficient statistics associated with nuisance parameters. 

Hypotheses involving parameters of interest may be tested by computing the tail prob- 
abilities for the conditional distribution P(T<j ^tdjT—^ =t_ < j ). [11] applies double 
saddlcpoint approximation to the problem in the case where do = 1, d > 1 and T is the 
mean of independent and identically distributed random vectors. Here, we propose a 
method that extends the results to do > 1 and d > do , using the idea from the previous 
sections. 



G(r) 



n-l 1 2sinh(r J /2)(w J ) 
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First, consider T, the mean of independent and identically distributed continuous 
random vectors. Then 



P(T do >t do |T_ do =t_ do ) 



Jt7 fAvi, ■ ■ • ,yd ,td +i, ■ ■ ■ ,td) dyd 
h_ d0 (t-d ) 



where /t(') is the joint density and fT_ d (■) is the marginal density of T_d . Again, we 
use the Fourier inversion formula to obtain 



P(T do >t do |T_ do =t_ do ) 



,d— dn /-c+ioo 



(27ti) d 



exp(n[if(r) - r T t]) 



dr// T _ do (t_ do ), (4.1) 



where K(t) is the cumulant generating function of the random vector T. The numerator 
is just a special case of (2.1). 

Approximation (2.10) holds because of the following lemma which will allow us to 
apply previous unconditional results by substituting components of w for components of 
w when the components correspond to variables in the conditioning event. 



Lemma 4.1. 



w+iK 



cxp(n[(l/2)w T w - w T w]) d Tj U%i K - ">j(wj-i)) 



(2™) d U_ iK U f U {Wj - ^(w^)) f = \ <K" n?li P(^(w,)) 



■ dw 



^d—dn /-w+iK 



exp(n[(l/2)w T w — w T w] 
(2m)^_ iK fl£ (^-^(w^-i)) 



G(r)dw(l + 0(n- 1 )), 



w/ie 



g (t) = — — ; 11 x- 



Proof. By Watson's lemma, given fixed , we have 

w_ do +iK ( |--^ 

2 wT d w -do - w T do w_ do 



(w dQ ,w_ do ) 



exp n 



dr- 

J - J - dui,- 

j=d +i ■> 



w_ do +iK 

exp[ n 

•w-dn— iK 

1 + £(w*J 



dw_ do 



TT ^ 

11 dw, 



(4.2) 



1202 



J. Kolassa and J. Li 



for some analytic function E(wd a ) of 0(1). Therefore, 



LHS 



{2m) d 



da f * d0+ iK exp(n[(l/2)wj n w (i0 - WjwJ) *> dr^ 

11 dm, 



w do -iK 



r-w_,i +iK 

: / exp| n 

'w_ do iK 



TT ^ 

11 din; 



(w tio ,w_ do ) 



A 1 



where 



and 



B 



1 £ 

n A 

n d-d /•^+ iK exp ( n [(l/2)w T w- w T wl)„ ; N1 
(27n)^_ iK nJiK-^'^-O) 

n d-d /-w+iK Cxp ( n [( 1 / 2 ) W T W _ ^T w 



G(T)£(w dD )dw. 



If A and i? are expanded according to [5], each integral is approximated by a tilting 
term times a normal multivariate tail probability, up to relative order Q(\/yJn). The 
expression for B is also multiplied by the leading term of E. Hence, A/B ~ 0(1) and, 
therefore, the left-hand side equals A(l + 0(n -1 )). □ 



To deal with the denominator in (4.1), [7] demonstrates that 



27ti 



exp n 



w_ dn -ioo 



a A 

n ^ 



i=do + l 



dw, 



dw_ do 



(0d ,w-d ) 



/ T _ (io (t- do )(l + 0(r i - 1 )). 



(4.3) 



This development is similar to that of [4], page 147. 

With continuous variables, we can decompose A according to (2.11) with 



G(x)=n^^ (w ^ i)dT: 



dw. 



TT 

11 H-in. 



i=d + l 



(w do ,w_ do ) 



(4.4) 
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Denote the left-hand side of (4.3) by J_ do . Note that G(0) = U^ =do+1 ^-\{o da ,^^ do ) 
The main term is then 



w+ioo 



exp(n[(l/2)w w — w w]) 
(27ti)^n 3 t 1 K'-^(w J -_i)) (27ri) 

exp(n[(l/2)w T w — w T w]) 



,d-d 



TT ^ 

j=d + l J 



dw 



(0 



d ,™-d ) 



w-ioo (27ti)*> EI -=lK' - ^(w,-l)) 

w+io ° exp(n[(l/2)w T w- w T w]) 
w-ioo (2m) rf ° n^iK - «5j(Wj_l)) 



dw do • / T _ do (t_ do ), 



where 



exp(n[(l/2) 



W W — W W 



/w-ioo (27Ti)* n"ilK - %(Wj-l)) 

can be obtained by formula (3.4). 

Using the same technique as in (3.5)-(3.10), we have 



dw do 



1 d-d /-w+ioo 



{2naiy 



exp(n[(l/2)w T w - w T w]) G^(t) - G(0) 



dw 



\l nc{{ } Ylj=do+i d7 "j/ du, il(o,i ,*- do ) 



;[y t M ],)*(y w ,s M )- J- d0 



at 0(?i J ). The computation involves n^+iJFlft 
using (2.9). 

In summary, in the conditional case, P(T do > t do |T_ do > t_ rfo ) ~ Yl\s\<x,sCU IS 7 7-do (*-*>)) 
where J7 = {1,2, . . . , do}- 



(4.5) 



(4.6) 



(4.7) 



(lUl,Ul2,W_ 2 ) 5 



which can be obtained 



Table 1. Results of saddlepoint approximation compared with other approxima- 
tions in the continuous case 



j/i y2 P. approx. K. approx. N. approx. Exact Relative error 



2.5 


2.5 


9.12 x 10" 


-■> 


8.98 x 10" 


2 


9.65 X 10" 


2 


9.22 x 10" 


-2 


-1.08% 


2.5 


3.5 


1.41 x 10" 


-2 


1.41 x 10" 


2 


6.54 x 10" 


3 


1.41 x 10" 


-2 


0.00% 


2.5 


4.0 


3.91 x 10" 


-3 


3.99 x 10" 


3 


6.69 x 10" 


3 


3.93 x 10" 


-3 


-0.51% 


3.0 


3.0 


2.20 x 10" 


-2 


2.14 x 10" 


2 


1.46 x 10" 


2 


2.22 x 10 


-2 


-0.90% 


3.0 


3.5 


8.97 x 10" 


-3 


8.73 x 10" 


3 


3.52 x 10" 


3 


8.96 x 10 


-3 


0.11% 


3.5 


3.5 


4.40 x 10" 


-3 


4.25 x 10" 


3 


1.09 x 10" 


3 


4.40 x 10 


-3 


0.00% 


3.5 


4.0 


1.67 x 10" 


-3 


1.61 x 10" 


3 


1.78 x 10" 


4 


1.66 x 10" 


-3 


0.60% 


4.0 


4.0 


7.69 x 10" 


-4 


7.34 x 10" 


4 


3.88 x 10" 


5 


7.58 x 10" 


-4 


1.45% 
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Table 2. Results of saddlepoint approximation compared with other approxima- 
tions in the unit lattice case 





J/2 


P. approx 




K. approx. 


N. approx. 


Exact 




Relative error 


4.5 


4.r> 


1.15 x 10" 


i 


1.16 x 10" 


-i 


1.16 x 10" 


i 


1.15 x 10" 


-i 


0.00% 


4.5 


5.0 


4.43 x 10" 


2 


4.51 x 10" 


-2 


4.28 x 10" 


2 


4.44 x 10" 


-2 


-0.23% 


4.5 


5.5 


1.04 x 10" 


2 


1.05 x 10" 


-2 


8.73 x 10" 


3 


1.04 x 10" 


-2 


0.00% 


4.5 


6.0 


1.46 x 10" 


3 


1.45 x 10" 


-3 


9.50 x 10" 


4 


1.46 x 10" 


-3 


0.00% 


5.0 


5.0 


2.07 x 10" 


2 


2.12 x 10" 


2 


1.92 x 10" 


2 


2.08 x 10" 


-2 


-0.48% 


5.0 


5.5 


5.89 x 10" 


3 


6.04 x 10" 


-3 


4.85 x 10" 


3 


5.91 x 10" 


-3 


-0.34% 


5.0 


6.0 


9.91 x 10" 


4 


1.01 x 10" 


-3 


6.40 x 10" 


4 


9.94 x 10" 


-4 


-0.30% 


5.5 


5.5 


2.11 x 10" 


3 


2.16 x 10" 


-3 


1.57 x 10" 


3 


2.11 x 10" 


-3 


0.00% 


5.5 


6.0 


4.45 x 10" 


4 


4.56 x 10" 


-4 


2.69 x 10" 


4 


4.47 x 10" 


-4 


-0.45% 


6.0 


6.0 


1.21 x 10" 


4 


1.24 x 10" 


-4 


6.14 x 10" 


5 


1.21 x 10" 


-4 


0.00% 



Similarly to the unconditional case, in the case of unit lattice variables, we have 

c(t\ = ft f Wj ~ -jfe-zij dT J \ rr ^i- 

[> 11 V 2sinh(r,/2) dw 3 ) }\ Aw . ' 

3- L J-da + L (w do ,w_ cio ) 

Other analytic properties and formulae still hold. 



5. Five examples 

We present five examples here. The fourth example is based on real data. 

In the first example, we consider the bivariate random vector (Yi,!^), with Y\ = 
X\ + X 2 and Y 2 = X 2 + X 3 , where X\, X 2 and X 3 are independent and identically 
distributed random variables following the exponential distribution, which has a density 
function f(x) =c~ x for x > 0. The results for approximating P{Y\ > yi,Y 2 > y 2 ) when 
n = 5 are listed in Table 1, where "P. approx." stands for the saddlepoint approximation 
proposed in this paper, "K. approx." stands for the saddlepoint approximation presented 
in [5] and "N. approx." stands for bivariate normal approximation. The "exact" column 
shows the exact tail probability values computed in [15]. The "relative error" column 
shows the relative error of "P. approx." The results for the cases (y~i,y 2 ) = (2.5,3.0) and 
{yiiVz) = (3.0,4.0) are the special cases where w\ = 0, which we have mentioned, but 
which are omitted here because of the removable singularity. The normal approximation 
deteriorates at the far tail, while both saddlepoint approximations show much better and 
more stable relative errors. In almost all cases, the new method shows smaller relative 
errors than those in [5]. 

In the second example, we consider the bivariate random vector (Yi,Y 2 ), with Y\ = 
X\ + X 2 and Y 2 = X 2 + X$, where X\, X 2 and X3 are independent and identically dis- 
tributed random variables following the binomial distribution, which has a mass function 
{ 1 x)p X 0- ~p) N ~ x fa? <x < N. The results for approximating P(Yi > y~i,Y 2 > y 2 ) when 
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iV = 10,p = 0.2 and n = 8 are displayed in Table 2. We can again see from the table that 
the normal approximation (with adjustment for continuity) deteriorates at the far tail, 
while the saddlepoint approximations show much better and more stable relative errors. 
In most cases, the new approximation shows better accuracy than that of [5]. 

The third example involves conditional distribution functions. Let Xi, i = 1,2,3, be 
independent and identically distributed random variables following the exponential dis- 
tribution, as in the first example. Consider the random vector (Y\,Y2, Y3) with Y\ = X2, 
Y2 = X s and Y 3 = Xi + X 2 + X 3 . The results for approximating P(Yi > yi, Y2 > y2\Y$ = 
i/3) when n = 10 are shown below in Table 3. The case where y\ = 2.0, y 2 = 2.5 and 
g/3 = 7.0 is the special case where both T2(0) = and w 2 = 0, as discussed in Section 3, 
and is omitted here. The cases where yi = 2.0, y 2 — 3.0 and g/3 = 7.0, and j/i = 2.0, 
j/2 = 2.5 and j/3 = 6.5, are the cases where Wi = 0; these are also omitted. The exact 
values arc computed in [15]. 

The fourth example was used in [5] and [6], which refers to data presented in [12]. The 
data consist of 63 case-control pairs of women with endometrial cancer. The relation- 
ship between the occurrence of endometrial cancer and explanatory variables including 



Table 3. Results of saddlepoint approximation compared with bivariate 
normal approximation in the conditional continuous case 



yi 


2/2 


J/3 


P. approx. 




N. approx. 


Exact 




Relative error 


2.0 


2.0 


7.0 


4.42 x 10" 


1 


8.04 x 10" 


-2 


4.38 x 10" 


-1 


0.91% 


2.5 


2.5 


7.0 


6.25 x 10" 


2 


2.04 x 10" 


-2 


6.32 x 10" 


-2 


-1.11% 


2.5 


3.0 


7.0 


8.00 x 10" 


3 


4.14 x 10" 


■5 


8.54 x 10" 


■3 


-6.32% 


3.0 


3.0 


7.0 


3.02 x 10" 


4 


1.00 x 10" 


■8 


3.46 x 10" 


-4 


-12.7% 


2.0 


2.0 


6.5 


2.93 x 10" 


1 


1.16 x 10" 


■1 


2.91 x 10" 


-1 


0.69% 


2.0 


3.0 


6.5 


1.09 x 10" 


2 


6.48 x 10" 


-5 


1.14 x 10" 


-2 


-4.39% 


2.5 


2.5 


6.5 


1.49 x 10" 


2 


6.96 x 10" 


-4 


1.56 x 10" 


-2 


-4.49% 


2.5 


3.0 


6.5 


5.25 x 10" 


4 


1.57 x 10" 


-7 


6.09 x 10" 


-4 


-13.8% 


3.0 


3.0 


6.5 


9.63 x 10" 


7 


3.67 x 10" 


-12 


1.10 x 10" 


■6 


12.5% 



Table 4. Differences between cases and controls for endometrial can- 
cer data 



Gall bladder disease 


-1 


-1 


-1 

















Hypertension 


-1 





1 


-1 


-1 








1 


Non-estrogen drug use 





-1 





-1 








1 





Number of pairs 


1 


1 


1 


2 


(i 


14 


10 


12 


Gall bladder disease 





1 


1 


1 


1 


1 


1 


1 


Hypertension 


1 


-1 


-1 











1 


1 


Non-estrogen drug use 


1 





1 


-1 





1 





1 


Number of pairs 


4 


3 


1 


1 


4 


1 


1 


1 
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gall bladder disease, hypertension and non-estrogen drug use is modeled with logistic 
regression. [12] noted that the likelihood for these data is equivalent to that of a logistic 
regression in which the units of observation are the matched pairs, the explanatory vari- 
ables are those of the case member minus those of the control member and the response 
variable is 1. 

The number of pairs with each configuration of differences of the three variables are 
shown in Table 4. Let Zj, j = 1, 2, . . . , 63 denote the differences of covariates between cases 
and controls, as given in Table 4. Consider the situation under the null hypothesis, where 
the linear coefficients are zero. Let Zj, j = 1, 2, . . . , 63, be the random vectors that take 
value Zj with a probability of ^ and with a probability of A . Let Z be matrix whose rows 
are Zj and where T = Z'l, where 1 is a column vector with dimension 63. We then have 
K(t) = J2j m j [log( 1+cx P( z J' r ) )] , [6] tested the association of hypertension or non-estrogen 
drug use with an increase in endometrial cancer, conditional on the sufficient statistic 
value associated with gall bladder disease. The test required evaluating the quantity 
P(T 2 > 10 or T 3 > 13|Tj = 9) for T = (T U T 2 ,T 3 ). By Boole's law, this probability can be 
computed using 

P(T 2 > 10|Ti = 9) + P(T 3 > 13|Ti = 9) - P{T 2 > 10, T 3 > 13|T X = 9). 

The results for approximating P(T 2 > 10, T 3 > 13|7i = 9) compared to those listed 
in [6] are shown in Table 5, where "N. app." stands for normal approximation, "E. 
app." stands for Edgeworth approximation, "K. app." stands for the approximation 
presented in [6] and "P. app." is the proposed approximation. Approximation re- 
sults of P(T 2 > t 2 ,Ts > i 3 |Ti = 9) for other values of t 2 and i 3 are also listed in 
the table. We can see that the proposed method achieves better results than other 
methods, except for the method of [6], which is far more complicated computation- 
ally. 

In the fifth example, we consider a multivariate gamma distribution, which is the 
diagonal of a Wishart distribution, formed from a 3-variatc normal distribution, with 
covariance matrix 

/ 1 0.25 0.25\ 
V= 0.25 1 0.25 
\0.25 0.25 1 / 



Table 5. Endometrial cancer results for some (£2,^3) instances 



Method 


(10,13) 




(9,12) 




(8,11) 




(7,10) 




(6,9) 




N. app. 


3.50 x 10" 


-4 


1.78 x 10" 


-3 


7.26 x 10" 


■3 


2.39 x 10" 


■2 


6.39 x 10" 


-2 


E. app. 


3.31 x 10" 


-4 


1.72 x 10" 


-3 


7.13 x 10" 


■3 


2.37 x 10" 


■2 


6.37 x 10" 


-2 


K. app. 


1.51 x 10" 


-4 


1.07 x 10" 


-3 


5.37 x 10" 


-3 


2.01 x 10" 


2 


5.84 x 10" 


-2 


P. app. 


1.62 x 10" 


-4 


1.13 x 10" 


-3 


5.60 x 10" 


-3 


2.08 x 10" 


2 


6.00 x 10" 


-2 


Exact 


1.52 x 10" 


-4 


1.09 x 10" 


-3 


5.48 x 10" 


-3 


2.05 x 10" 


2 


5.95 x 10" 


-2 
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Table 6. Results of saddlepoint approximation compared with normal approximations 
for a multivariate gamma distribution 



Hi 


2/2 


2/3 


P. approx. 




N. approx. 


Simulation 


Std. err. 




Relative erroi 


5.5 


5.5 


5.5 


4.93 x 10" 


2 


9.64 x 10" 


■2 


5.58 x 10" 


2 


1.42 x 10" 


-3 


-11.6% 


5.5 


5.5 


6.5 


3.65 x 10" 


2 


7.17 x 10" 


■2 


4.16 x 10" 


2 


1.24 x 10" 


-3 


-12.3% 


5.5 


6.5 


6.5 


2.70 x 10" 


2 


5.34 x 10" 


■2 


3.26 x 10" 


2 


1.10 x 10" 


-4 


-17.2% 


6.5 


6.5 


6.5 


1.98 x 10" 


2 


3.99 x 10" 


■2 


2.45 x 10" 


2 


9.58 x 10" 


-4 


-19.2% 


6.5 


6.5 


7.5 


1.45 x 10" 


2 


2.78 x 10" 


■2 


1.83 x 10" 


2 


8.31 x 10" 


-4 


-20.8% 


6.5 


7.5 


7.5 


1.05 x 10" 


2 


1.94 x 10" 


2 


1.34 x 10" 


2 


7.11 x 10" 


-4 


-21.6% 


7.5 


7.5 


7.5 


0.76 x 10" 


2 


1.36 x 10" 


3 


1.04 x 10" 


3 


1.89 x 10" 


-4 


-26.9% 



and n = 5. The results are listed in Table 6, where "P. approx." stands for the sad- 
dlepoint approximation proposed in this paper and "N. approx." stands for bivariate 
normal approximation. The "simulation" and "std. err." column shows the simulation 
results and 5% standard error. The "relative error" column shows the relative error 
of "P. approx." compared with the simulation results. We can see that the proposed 
approximation performs better than the normal approximation. 
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